Journal of New Music Research 
2008, Vol. 37, No. 1, pp. 1-13 




Routledge 

Taylor & Francis Group


An Automatic Pitch Analysis Method for Turkish Maqam Music 


Barış Bozkurt

Izmir Institute of Technology, Turkey 


Abstract 

Automatic pitch analysis of large audio databases is 
essential for studies on music information retrieval and 
developing a pitch scale theory for Turkish maqam 
music. However, no such study is available. In this article,
we first identify the main obstacle as the alignment of
frequency analysis results from multiple files. We then
propose a new method to automatically detect the tonic 
of a recording, align the data, and estimate overall 
frequency histograms from large databases. We show 
that such histograms can be successfully used for pitch 
scale (tuning) studies on the recordings of Tanburi Cemil 
Bey, an undisputed master of the genre. 


1. Introduction 

Although the “maqam world” corresponds to a very large
multicultural region (Touma, 1971;
Powers, 1988; Zannos, 1990), computational methods
for processing maqam music are very limited in number.
This study targets the development of fully automatic
methods for frequency analysis of large audio databases
of maqam music. In this study, for practical reasons, the
discussions and data are limited to Turkish maqam
music. However, the basic methods presented are potentially
applicable to other maqam music traditions.

Development of computerized music analysis methods 
necessitates use of music theory. Similarly, music theory 
research can also profit highly from use of computer- 
based analysis especially for traditional music (Tzanetakis 
et al., 2007). Today, for Turkish maqam music studies, it 
appears that computerized analysis can be a very 
important tool for progress in music theory research. 

A maqam generally implies a set of rules for 
composition and improvisation. These rules are defined 


in various dimensions: the pitch scale, melodic progression
defining overall ascending-descending characteristics,
temporary tonics, possible modulations to other
maqams, etc. For an in-depth analysis of maqam music, 
computerized methods are needed that can provide a 
large set of data for each of these dimensions. This study 
aims at providing automatic analysis tools only for the 
pitch scale dimension of the problem. 

It is well known that developing the pitch scale theory 
for Turkish maqam music is still an open issue. Many 
studies mention that there exist mismatches between the 
intervals measured on recordings from master musicians 
and those specified in the Arel-Ezgi-Uzdilek (AEU) 
theory (Arel, 1930) although this theory is widely 
accepted and taught. 1 New studies continue to emerge 
which compare old scales with actual intervals being 
played and propose new formulations for more efficient 
representations of pitch scales (i.e. maqam music notes
called “perde”s) (Karaosmanoğlu & Akkoç, 2003;
Tulgan, 2007; Yarman, 2008).

It is clear that collecting large audio databases and
analysing them in a systematic way is crucial for
such studies. However, the common approach in
most studies is to analyse a very limited number of
examples from well-known musicians and draw general
conclusions from these limited data. This choice in
methodology stems from some difficulties related to 
the nature of maqam music: it is well known 
in maqam music theory that notes do not correspond 
to standardized single fixed frequencies. In addition to 
the fact that there are 12 possible diapasons called 


“ahenk”s 2 and dozens of maqams employing a large
variety of intervals, musicians may interpret
certain degrees of some scales differently. 3 Therefore, it is not at all
trivial to develop computerized methods that can perform
automatic analysis and gather results from large databases.
The lack of such methods continues to hamper
research into the actual tuning of Turkish maqam music
and the development of a theory conforming to practice.

1 A recent congress was completely dedicated to this topic:
“Theory-application mismatch for Turkish Music: Problems
and Solutions”, organized by Istanbul Technical University,
State Conservatory for Turkish Music, 3-6 March 2008,
Istanbul.

Furthermore, a literature study of maqam music
shows that music information retrieval (MIR) techniques
developed for maqam music are very limited in
number. The problems mentioned above lie at the
heart of many issues in MIR studies.

This study proposes a novel method for aligning pitch 
frequencies computed on various recordings in order to 
facilitate automatic analysis of large databases. The 
alignment is performed via matching the tonics of each 
recording. To achieve this, a new tonic detection 
algorithm for maqam music is presented. 

It should be noted that for direct alignment with tonic 
as reference, the recordings should be from the same class 
of maqams that have the same note as tonic. In terms of 
their tonics, maqams can be classified into several broad 
classes, those with the tonic note being: rast, dügah,
segah, ırak, yegah, acem aşiran, hüseyni aşiran, geveşt,
buselik, çargah 4 (Karadeniz, 1965). The largest set is that
of maqams with dügah as tonic, followed by rast and
segah. From tables provided in Çevikoğlu (2007), the
percentage of songs in the repertory of TRT (Turkish
Radio Television Broadcasting) can roughly (by computing
percentages on the list provided for 72% of the whole
repertory) be calculated as: 44.7% for maqams with
dügah tonic, 39.2% for maqams with rast tonic and
11.8% for maqams with segah tonic. This means that, if
recordings from these three classes (maqams with tonic
as dügah, rast or segah) can be aligned and processed
together, one could cover more than 90% of recordings.
To avoid confusion and save space, we have chosen
recordings only from maqams with dügah tonic as test


2 Ahenk is synonymous with key transposition or diapason. Due 
to the fact that maqam music notes (“perde” s) are named by 
reference to their relative positions, the same perde corresponds 
to different pitches on different sizes of instruments. The reader 
is referred to Erguner (2007) and Appendix B of Yarman (2008)
for detailed reviews on ahenks, ney fingerings, location of holes 
on neys with various lengths and the corresponding pitch 
frequencies produced. 

3 For example for fretted instruments there is a lack of standards 
and defining appropriate fret locations is an open topic of 
research. It is observed that the number of frets and their 
locations vary regionally or due to personal choice. 

4 It should be noted that some names are used both for maqams
and notes (“perde”s), like: Hicaz, Buselik, Kürdi, Segah, Mahur,
etc. To distinguish the two, maqam names are written with
a capital first letter.


data in the application part of this study. But the system 
can be successfully applied to other classes in a 
straightforward way. 

To demonstrate the effectiveness of the method, we 
present an application of the methods to recordings of 
Tanburi Cemil Bey (1871-1916). There are various 
reasons for this choice. Firstly, Tanburi Cemil Bey is
the musician most commonly cited for the correctness
and precision of his pitch intervals. Many talented
musicians claim that his recordings are the most valuable
source of information for maqam music (Tanrıkorur,
2004). He mastered many instruments including
tanbur, kemençe, lute, tar, cello, violin, kanun, clarinet,
zurna and bağlama. He is known as the creator of the
bowed tanbur (yaylı tanbur) and was an indisputable
virtuoso of kemençe and tanbur. Although he was
mainly known as a performer, he was an important
composer too. His compositions are still being played not
only in Turkey but also in Iran, Iraq, Syria, Lebanon,
Egypt, Tunisia, Greece and the Balkans. For further
information the reader is referred to Cemil (1947).

In addition, the author thinks that Tanburi Cemil 
Bey’s recordings, being among the rather old recordings 
of maqam music, represent the traditional pitch frequencies
taught through oral education without much
influence from the official Arel-Ezgi-Uzdilek (AEU) 
theory (Arel, 1930). It can be easily seen in recent 
instrumental methods that tables on fretting reflect more 
the dictates of the AEU tuning than the demands of 
actual practice. A similar predicament is also true for 
formal education; the conservatories teach the AEU 
system. However, mismatches between theory and actual 
practice are also acknowledged. Through the tests 
performed on Tanburi Cemil Bey’s recordings for our 
proposed methods, we show that “theory-practice” 
mismatches can be directly observed and scrutinized on 
the average histograms automatically obtained. 

Throughout the study, the YIN algorithm (de
Cheveigné & Kawahara, 2002) is used for fundamental
frequency (f0) estimation together with some post-filters
designed specially for Turkish maqam music. These post-filters
are explained in Section 2. All recordings used are
monophonic to avoid the complex multi-pitch estimation
problem since it is not the main issue in our study. In
Section 3, we discuss f0 histogram computation. It is
common practice to use the Holdrian comma (Hc) as the
smallest intervallic unit in Turkish maqam music theory.
To facilitate comparisons with other studies, we also use
the Holdrian comma unit in most of our figures and
tables. Section 4 is dedicated to the presentation of our
proposed method for combining histograms of multiple
recordings and our tonic detection algorithm. Section 5
includes the discussions and conclusions. In addition, the
AEU system, which is also explained in Yarman (2007) to
be equivalent to the Yekta-24 system in terms of intervals
used, is very briefly explained in Appendix A. The audio
material used is referred to at various steps in plots and
discussions. We preferred to present the brief description
of the recording database used in Appendix B so as not
to interrupt the natural flow of the text.


2. F0 post-filtering for Turkish maqam music
recordings

Fundamental frequency estimation methods are not free 
of errors. Depending on the acoustic characteristics, 
different methods have different error rates. It is common
practice to design post-filtering methods to correct some
of these errors. For some special cases such as ours, post-filters
may be designed considering the melodic characteristics
of a particular type of music. For Turkish
maqam music we have verified that the post-filters below
can be effectively used.

Filters which impose continuity with the assumption 
that the following are not likely in Turkish maqam music: 

(1) low signal energy and short duration pitch chunks 
with boundary intervals larger than a perfect fifth; 

(2) a relatively short duration chunk an integer number
of octaves higher or lower in pitch than a
long duration preceding or following chunk;

(3) melodic dynamic range being larger than four 
octaves. 

The first two properties are implemented via conditional 
code lines. The last one is realized as a filter by: 
computing the mean of frequencies measured for a 
recording, defining maximum and minimum possible 
frequencies as two octaves higher and lower than the 
mean for that specific recording and filtering out the 





estimates falling outside this range. It is observed that 
such simple post-filters can correct an important portion 
of errors for some recordings. Two examples are
provided in Figure 1.

These filters are particularly needed for old
recordings like Tanburi Cemil Bey’s due to the high level
of background noise causing relatively high f0 estimation
errors.
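
The third filter (the mean-centred range limit) can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function name `octave_range_filter`, the use of NumPy, and the convention that a value of 0 marks an unvoiced frame are all assumptions made for the example.

```python
import numpy as np

def octave_range_filter(f0_hz, octaves=2.0):
    """Keep f0 estimates within +/- `octaves` of the recording's mean f0.

    Assumption: frames with f0 <= 0 are unvoiced and are dropped first.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = f0[f0 > 0]
    mean_f0 = voiced.mean()                 # mean of measured frequencies
    lowest = mean_f0 / 2.0 ** octaves       # two octaves below the mean
    highest = mean_f0 * 2.0 ** octaves      # two octaves above the mean
    return voiced[(voiced >= lowest) & (voiced <= highest)]

# Hypothetical track: ten frames at 220 Hz plus two gross estimation errors.
track = [220.0] * 10 + [1500.0, 40.0, 0.0]
cleaned = octave_range_filter(track)
```

Because the limits are derived from the arithmetic mean, a recording dominated by gross errors would shift the allowed range; the sketch makes no attempt to guard against that case.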


3. Bin-width selection for f0 histogram analysis

Use of histograms for analysis of pitch frequencies is a
common practice (Akkoç, 2002; Karaosmanoğlu &
Akkoç, 2003; Zeren, 2003; Karaosmanoğlu, 2004). The
bin-width selection problem, although very rarely mentioned
in the literature, appears to be a critical issue for
computer-based analysis of f0 histograms that may
include peak detection.

A pitch histogram, $H_{f_0}[n]$, is a mapping that gives
the number of f0 values that fall into each of a set of
disjoint categories (known as bins):

$$H_{f_0}[n] = \sum_{k=1}^{K} m_k,$$

$$m_k = \begin{cases} 1, & f_n \le f_0[k] < f_{n+1}, \\ 0, & \text{otherwise}, \end{cases}$$

where $(f_n, f_{n+1})$ are boundary values defining the f0 range
for the nth bin.

The choice of the bin-width ($f_{n+1} - f_n$), the width of
each category, defines the resolution of the histogram. In
theoretical pitch scale studies, it is common practice to
use uniform sampling of the whole f0 range. Given the



Fig. 1. Post-filtering examples: (a) singing, (b) tanbur.

Fig. 2. Histograms obtained with (a) 1/3 Hc grid and (b) 1/9 Hc grid.


number of bins, $N$, and the f0 range ($f_{0\max}$ and $f_{0\min}$), the
bin-width, $W_b$, is obtained simply by the following:

$$W_b = \frac{f_{0\max} - f_{0\min}}{N},$$

$$f_n = f_{0\min} + (n - 1) W_b.$$
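
The uniform-grid histogram defined above can be sketched in a few lines. The function name `uniform_histogram` and the use of `numpy.histogram` are illustrative assumptions; note also that NumPy treats the last bin as closed on the right, a small departure from the half-open bins in the definition.

```python
import numpy as np

def uniform_histogram(f0, f0_min, f0_max, n_bins):
    """Count f0 values on a uniform grid of `n_bins` bins."""
    w_b = (f0_max - f0_min) / n_bins                 # bin-width W_b
    edges = f0_min + np.arange(n_bins + 1) * w_b     # f_1 ... f_{N+1}
    counts, _ = np.histogram(f0, bins=edges)
    return counts, edges, w_b

# Toy data on an arbitrary 0..1 range, split into two bins.
counts, edges, w_b = uniform_histogram([0.1, 0.2, 0.5, 0.9, 1.0], 0.0, 1.0, 2)
```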

One of the critical choices made in histogram computation
is the decision of bin-width when automatic methods
are concerned. In automatic processing of histograms,
peak detection is one of the basic operations; therefore our 
choice targets facilitation of the detection of note peaks. 

A fine grid, i.e. small bin-width, has an advantage in
terms of precision but is disadvantageous for automatic
peak picking since spurious peaks are produced. The
reverse holds for a coarse grid, i.e. large bin-width. As an
example we present, in Figure 2, two histograms with
two different bin-widths on the same f0 data.

As can be seen in Figure 2, a simple peak picking
algorithm would find more than twice as many peaks in
the 1/9 Hc grid compared to the 1/3 Hc grid. As a result of
empirical tests with various grid sizes, we decided to use
the 1/3 Hc resolution, a value that balances smoothness
and precision of pitch histograms. In addition, this is the
highest precision we could find in theoretical pitch scale
studies (as used in Yarman (2008)). In all the pitch
histograms used in this study, the pitch frequencies
measured are indicated as intervals in Holdrian commas
with 1/3 Hc precision with respect to the tonic.
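
The effect of grid size on naive peak picking can be illustrated with a toy example. The peak picker below (a bin strictly higher than both neighbours) and the helper names `count_peaks` and `coarsen` are assumptions made for this sketch; the study does not specify its peak detection algorithm, and the counts are invented.

```python
import numpy as np

def count_peaks(hist):
    """Count interior bins that are strictly higher than both neighbours."""
    h = np.asarray(hist)
    inner = h[1:-1]
    return int(np.sum((inner > h[:-2]) & (inner > h[2:])))

def coarsen(hist, factor=3):
    """Merge every `factor` adjacent bins, e.g. a 1/9 Hc grid into 1/3 Hc."""
    h = np.asarray(hist)
    usable = len(h) - len(h) % factor
    return h[:usable].reshape(-1, factor).sum(axis=1)

# Invented counts on a fine grid: the same data yield fewer peaks once merged.
fine = [0, 3, 1, 4, 0, 2, 0, 1, 0]
coarse = coarsen(fine)
n_fine, n_coarse = count_peaks(fine), count_peaks(coarse)
```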


4. A method for combining histograms of 
multiple files 

In studies concerning pitch scales of Turkish maqam
music, each recording’s histogram is studied separately
and manually due to the difficulty of gathering analysis
results in a systematic way, as mentioned previously. In
the literature, we could find only one software tool that
lets the user combine results from more than one
recording: “İcra Analizi”. 5 In “İcra Analizi”, the user can
manually input “calibration” values to align the results
from various recordings and derive histograms of played
intervals for more than a single recording. “İcra Analizi”
is no doubt an important step towards tackling the
problem. However, with this tool, processing large databases
requires manual work (finding the correct
calibration value and typing it into the system), which is very
cumbersome.

Due to the tuning problem mentioned in the introduction,
it is more convenient to study intervals rather than
exact pitches, as preferred in various studies. Two types of
intervals typically are used: interval with respect to the 
previous note played and interval with respect to the tonic 
of the maqam. The first one has the potential to capture 
musical context dependent intervals played: the intervals 
played in descending and ascending melodic lines can be 
segregated and studied. However it is practically difficult 
to implement such automatic algorithms robust to 
ornamentations like glissandos and vibratos frequently 
used in maqam music (see Figure 12, Appendix A, for 
example). Due to practical reasons, we use the second as a 
means to “align” pitch histograms, i.e. the tonic of the 
maqam serves as an alignment point for two pitch 
histograms to be combined. All pitches are computed as 
intervals with respect to tonic and counted independent of 
the time of occurrence or musical context to form 
histograms that can be further combined. The price paid
is the loss of the time dimension and the link of intervals
to the musical context.
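
Expressing pitches as intervals with respect to the tonic amounts to a logarithmic conversion, sketched below. It relies on the standard definition of the Holdrian comma as 1/53 of an octave; the function name `hz_to_hc` and the rounding to the 1/3 Hc grid are illustrative assumptions.

```python
import numpy as np

HC_PER_OCTAVE = 53  # one Holdrian comma is 1/53 of an octave

def hz_to_hc(f0_hz, tonic_hz, steps_per_hc=3):
    """Interval from the tonic in Hc, rounded to a 1/steps_per_hc Hc grid."""
    f0 = np.asarray(f0_hz, dtype=float)
    interval = HC_PER_OCTAVE * np.log2(f0 / tonic_hz)
    return np.round(interval * steps_per_hc) / steps_per_hc

# With a hypothetical tonic of 220 Hz: the tonic, its octave, a perfect fifth.
intervals = hz_to_hc([220.0, 440.0, 330.0], tonic_hz=220.0)
```

In this example the perfect fifth (frequency ratio 3/2) lands on 31 Hc, the value listed for the fifth degree in the AEU scale intervals.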

For this process to be automatic, a tonic detection 
algorithm is needed. In the next section we present our 
novel tonic detection algorithm. 

4.1 Automatic tonic detection 

For recordings with high signal-to-noise ratio, detection
of the tonic is trivial: in theory, a recording in a
specific maqam always ends at the tonic as the last note
(Akdoğu, 1989). We have implemented a simple algorithm
to capture the last note and tested it on various
recordings. This approach works quite well when the
background noise is rather low.

5 http://www.musiki.org/icra_analizi.htm

However, in old recordings such as Tanburi Cemil
Bey’s recordings, the energy of background noise sometimes
exceeds the energy of the musical signal and the
estimated frequencies become unreliable, especially towards
the end. For this reason a more robust method is
developed for tonic detection with the assumption that 
the maqam of the recording is known (either from the tags 
or track names since it is common practice to name tracks 
with the maqam name: “Hicaz taksim ”, for example). For 
tonic detection in such adverse situations, maqam 
histogram templates can be utilized as reference. Briefly, 
our algorithm detects the tonic of a given recording by
aligning its histogram with its maqam histogram template,
which is initialized as a Gaussian mixture from theoretical
intervals specified in the AEU system and updated
recursively as multiple recordings are aligned. We present
the complete process flow (starting from wave files), the 
tonic detection algorithm and the histogram template 
construction algorithm in Figure 3. 

Given the intervals defined in the AEU system for a
specific maqam, a simple theoretical histogram template
on the Holdrian comma scale, $H(f_0)$, for the given
maqam can be constructed as a mixture of Gaussian
functions, $G_k(f_0)$,

$$H(f_0) = \sum_{k=1}^{K} a_k G_k(f_0),$$

$$G_k(f_0) = \exp\left(-0.5\left(\alpha\,\frac{f_0 - f_k}{W_b/2}\right)^2\right),$$

where $f_k$ is the centre frequency of the Gaussian function
defined by the kth degree’s interval with respect to the tonic,
and $\alpha$ is the reciprocal of the standard deviation. $f_k$, $f_0$ and $W_b$
are discrete frequency values in Hc with 1/3 Hc
resolution.

Assigning the tonic to 0 Hc, the intervals specify the
centre frequencies of the notes, $f_k$, which correspond to
the mean of each Gaussian distribution function. For
simplicity the weights, $a_k$, of each distribution are
assigned to be equal and this value is set to the maximum
value of the histogram of a given recording for which the
tonic will be found. The variances of each distribution
are also set to be equal, and a user-defined value sets
the width of each Gaussian.

As an example, we present a histogram template for 
maqam Hicaz from the intervals defined in the AEU 
system in Figure 4. 
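
A template of this kind can be sketched as follows, using the AEU Hicaz intervals quoted in the caption of Figure 4. The sketch parameterizes each Gaussian directly by a standard deviation `sigma_hc` rather than by the width parameters of the formula above; `sigma_hc`, `margin_hc` and the function name are assumptions chosen for illustration, not values from the study.

```python
import numpy as np

# AEU scale intervals for maqam Hicaz, in Hc (from the caption of Fig. 4).
HICAZ_AEU_HC = [0, 5, 17, 22, 31, 35, 44, 53]

def gaussian_template(intervals_hc, weight=1.0, sigma_hc=1.0,
                      steps_per_hc=3, margin_hc=5):
    """Equal-weight, equal-width Gaussian mixture on a 1/3 Hc grid."""
    lo = min(intervals_hc) - margin_hc
    hi = max(intervals_hc) + margin_hc
    n = int(round((hi - lo) * steps_per_hc)) + 1
    grid = lo + np.arange(n) / steps_per_hc          # axis in Hc
    template = np.zeros(n)
    for f_k in intervals_hc:                         # one Gaussian per degree
        template += weight * np.exp(-0.5 * ((grid - f_k) / sigma_hc) ** 2)
    return grid, template

grid, template = gaussian_template(HICAZ_AEU_HC)
tonic_index = int(np.argmin(np.abs(grid)))           # bin located at 0 Hc
```

In use, `weight` would be set to the maximum of the recording's histogram, as described in the text.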

We have chosen the AEU system intervals for
template construction since it is “the official/standardized
system” despite its errors in representing the
practice. We leave the comparison with the use of other
tunings for this specific application to future work. Only
the initial histogram template is constructed theoretically.
The template is updated within a loop of
optimization and all templates except the initial one are
obtained by averaging the real recordings’ pitch histograms
after alignment. The use of histogram averaging
results in discarding relatively rare musical events and
constructing a representation of what is common as
pitch intervals in a large database. This is convenient for
our target but causes a loss of fine details which may be
critical in characterizing a given maqam. Certain degrees
of certain maqams’ pitch scales vary in pitch depending
on whether the melodic line is ascending or descending. Such
variations are observed to be relatively small (within less
than 1.5 Hc range) and in average histograms these
variations result in a widening of a peak instead of
creating additional peaks. On the other hand, it is
preferable to filter out some of the details like small
peaks due to temporary modulations to other maqams


Fig. 3. Tonic detection and histogram template construction algorithm (box indicated with dashed lines) and the overall analysis
process. All recordings should be in a given maqam which also specifies the intervals in the AEU system.

Fig. 4. Theoretical Hicaz maqam histogram template. The AEU
system scale intervals: [0 5 17 22 31 35 44 53] Hc (Table 1,
Appendix A).


which may lead to confusion in the pitch scale analysis of 
a specific maqam. This is especially problematic when 
automatic peak detection algorithms are used for 
analysis of the histograms. 

Once the initial template of the maqam is available,
automatic tonic detection of a given recording is
achieved by:

- computing the cross-correlation function between the 
template and the pitch histogram of the recording; 

- finding the maximum correlation point; 

- aligning the template and the actual histogram; 

- assigning the first peak to tonic. 

These steps are represented as two blocks (Alignment, 
Tonic Detection) in Figure 3. 

The cross-correlation function, $c[n]$, for two given
real-valued signals $x[n]$ and $y[n]$ can simply be computed
using the equation:

$$c[n] = \frac{1}{K}\sum_{k=0}^{K-1} x[k]\, y[n+k].$$

Computing $c[n]$ and finding the index of the maximum
peak, one can estimate the optimum amount of shift to
be applied to one of the signals such that the two
signal waveforms match best. This is illustrated
in Figure 5 for the track: Vol. 2, Track 16, Hicaz
Taksim.

The index of the maximum of cross-correlation 
function (marked with a circle in Figure 5(a)) gives us 
the amount of shift to be applied to the template signal 
for alignment of the two histogram signals as in 






Fig. 5. Aligning the theoretical Hicaz histogram template 
(dashed lines) and the pitch histogram of a recording (Vol. 2, 
Track 16, Hicaz Taksim). (a) Cross-correlation function of the 
theoretical histogram template and the pitch histogram of the 
recording, (b) Synchronized plot for the theoretical histogram
template (dashed lines) and the pitch histogram of the recording 
(solid lines). 

Figure 5(b). The location of the first Gaussian’s peak is
labelled as tonic and mapped to 0 Hc.
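
The alignment steps can be sketched with `numpy.correlate`. This is an illustrative reading of the procedure, not the author's code: the function name `detect_tonic_index` is hypothetical, and the 1/K normalization of the cross-correlation is omitted because it does not change the location of the maximum.

```python
import numpy as np

def detect_tonic_index(hist, template, template_tonic_index):
    """Locate the tonic bin in `hist` by cross-correlating with `template`."""
    hist = np.asarray(hist, dtype=float)
    template = np.asarray(template, dtype=float)
    c = np.correlate(hist, template, mode="full")
    # In "full" mode, output index m corresponds to lag m - (len(template) - 1),
    # where a positive lag means the histogram is shifted to the right.
    lag = int(np.argmax(c)) - (len(template) - 1)
    return template_tonic_index + lag, lag

# Toy template (first peak, i.e. the tonic, at bin 1) and a histogram whose
# pattern is the same but shifted three bins to the right.
template = [0, 1, 0, 0, 2, 0, 0, 0]
hist = [0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0]
tonic_index, lag = detect_tonic_index(hist, template, template_tonic_index=1)
```

The returned index can then be used to re-reference the histogram axis so that the detected tonic sits at 0 Hc.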

The algorithm described is applied on the recordings 
described in Section 5 and compared with manually 
labelled tonics. We present four additional samples in 
Figure 6. 

Even for maqams for which the theory is criticized 
(examples can be found in Tulgan (2007)) for not 
matching exactly the intervals of certain degrees (like 
maqam Saba, Figure 6(d)), tonic detection is successfully 
performed. The degrees misrepresented in theory are 
clearly seen on histograms and support the criticism. For 
example in Figure 6(d), the fourth-degree note peak does not
exactly match the theoretical Saba maqam histogram
template’s fourth-degree peak (corresponding to the
note “hicaz”), as stated in Tulgan (2007).

Fig. 6. Histogram matching examples for tonic detection. Maqams of the recordings are: (a) Hüseyni, (b) Tahir Buselik, (c) Rast, (d)
Saba.

The tonic detection algorithm was applied to 67
recordings, and results for five recordings were problematic.
In all other recordings, the tonic was successfully
found within ±2/3 Hc precision. For those five recordings,
it has been observed that the repetitions in intervals


resulted in wrong tonic detection. In addition, these
recordings included solos from different instruments
(singing and tanbur interchanging solo sequences), which
results in the addition of f0-regions that intersect but do not
match exactly in histograms. This typically causes
changes in the highest peak location and f0-data being
concentrated on a rather larger f0-range. For this reason,
we think that the algorithm is more reliable when applied
to solo recordings of a single instrument such as taksims
(which is in fact a very common improvisational form of
Turkish maqam music; taksims are also the main
material used in music theory research). We leave
comprehensive testing for various forms to our future
work.

Once tonics are detected, histograms of all recordings 
are averaged to obtain a new histogram template (nth 
iteration). Fortunately, the contribution of rare tonic 
detection errors in the template histogram update in the 
loop is rather negligible. This is illustrated in Figure 7, an 
example where the algorithm is applied to six recordings 


in maqam Uşşak. For the last recording, the tonic is
misdetected (Figure 7(a)) and the nth template is
computed cumulatively for demonstration, the last
addition being the misdetected example’s histogram.

In Figure 7(a), we observe that the tonic is located
close to the middle of the most frequent f0-range spanned,
which is not very likely for single instrument playing.
This example is a singing-tanbur duo where each
instrument plays solo in turn. The tanbur’s tonic is the
labelled peak and the singer’s tonic is the left-most peak,
while the misdetected tonic is located at 0 Hc. In
Figure 7(b), we present the accumulation of the new
histogram template from recording 1 to 6. The last
addition is performed with the example presented in
Figure 7(a) and its contribution is plotted with dotted
lines. It is clear that the general shape is not much
altered, the main change being a sharpening of the peak
at 31 Hc since it is the largest peak in the added
histogram. The template histogram therefore remains
usable in theoretical studies on pitch scales.

Fig. 7. (a) Tonic misdetection example (Vol. 5, Track 14, Uşşak) (correct tonic is labelled with a circle) and (b) the 2nd template
computed cumulatively where the last contribution in dots comes from the histogram in (a).


It is shown in Figure 3 that the algorithm updates the
template histogram in a loop: all recordings are aligned
with a given template to form a new template by
averaging, and the new template is again used to align
histograms. The loop continues until the template computed
in the (n-1)th cycle is the same as the template
computed in the nth cycle. Since histograms are discrete
signals with 1/3 Hc resolution, a few loops are sufficient to
reach termination. In Figure 8, the templates computed
for the six recordings in maqam Uşşak are shown
(iteration starts on the theoretical template with the
dashed lines).
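
The update loop can be sketched as below. It assumes that all histograms and the template share the same length and grid, and the helper names (`best_lag`, `shift_left`, `refine_template`) and the `max_iter` safety cap are illustrative additions, not part of the described method.

```python
import numpy as np

def best_lag(hist, template):
    """Lag (in bins) at which `hist` best matches `template`."""
    c = np.correlate(hist, template, mode="full")
    return int(np.argmax(c)) - (len(template) - 1)

def shift_left(hist, lag):
    """Shift `hist` left by `lag` bins, zero-filling instead of wrapping."""
    out = np.zeros_like(hist)
    n = len(hist)
    if lag >= 0:
        out[:n - lag] = hist[lag:]
    else:
        out[-lag:] = hist[:n + lag]
    return out

def refine_template(hists, initial_template, max_iter=20):
    """Align all histograms to the template, average, repeat until stable."""
    hists = [np.asarray(h, dtype=float) for h in hists]
    template = np.asarray(initial_template, dtype=float)
    aligned = list(hists)
    for _ in range(max_iter):
        aligned = [shift_left(h, best_lag(h, template)) for h in hists]
        new_template = np.mean(aligned, axis=0)
        if np.array_equal(new_template, template):   # termination condition
            break
        template = new_template
    return template, aligned

# Two toy histograms with the same two-peak pattern at different offsets.
initial = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
h1 = [0, 0, 0, 4, 0, 0, 2, 0, 0, 0]
h2 = [0, 0, 0, 0, 2, 0, 0, 4, 0, 0]
final_template, aligned = refine_template([h1, h2], initial)
```

After the first pass the template is already the average of the aligned data, so the initial (theoretical) template only influences where the histograms are placed.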

For this example, there is a mismatch between
theoretical template peaks and estimated final template:
the peak just after the tonic at 0 Hc. All studies
discussing the weak points of the AEU theory, without
any exception, state the mismatch of the theoretical and
performed intervals for this second degree in Uşşak
maqam’s pitch scale. Since the final template is obtained
as the mean of aligned histograms, the error in theory is
not reflected in the final histogram. Theory only serves
for initial alignment of histograms. For studies targeting
construction of models from the data (i.e. using the
analysis results of actual practice as the main reference
in building the model), this is crucial. One
alternative to using theoretical intervals for constructing
an initial histogram template is to provide the system with a
real recording histogram example with a manually
labelled tonic. This would need to be done once for
each maqam.


4.2 Aligning histograms of different maqams 

In the previous section we have explained an automatic
algorithm to compute histogram templates from recordings
for a given maqam. The templates computed can be
successfully used in analysing intervals used in a specific
maqam “in practice”.

As we have mentioned in the introduction, the tonic can
successfully serve as a reference point for alignment of
recordings from different maqams given that the tonic is
the same. In Figure 9, we present aligned histograms of
five maqams (with tonic dügah) from 17 recordings of
Tanburi Cemil Bey.

In Figure 9, we observe that most of the peaks
match theoretical intervals. However, for two peaks this
is not the case. A peak observed at 6.66 Hc for the
second degree of maqam Uşşak does not match
the theoretical interval specified to be 8 Hc. In many
studies (Özkan, 1998; Çakar, 2004; Erguner, 2007;
Tulgan, 2007) we find the following explanation for
this second degree: the second degree of maqam Uşşak,
the segah note, is played 1-1.5 Hc lower than indicated
on the staff notation (which refers to 8 Hc as the
interval).

Fig. 8. The templates computed iteratively by averaging N (=6) files in each iteration. Iteration number indicated on the resultant
template.

Fig. 9. Tonic aligned and normalized histograms for recordings with dügah as tonic: 5 maqam template histograms superimposed
(5 histograms computed from 17 recordings). Vertical dashed lines indicate the intervals specified in the AEU system (Table 1,
Appendix A).

A similar observation can be made on the fourth
degree of maqam Saba, namely the note hicaz. We
observe in Figure 9 that the fourth peak of the maqam
Saba template at 19 Hc does not match the theoretical
interval specified. In Özkan (1998), Çakar (2004) and
Tulgan (2007) the corresponding explanation is: the
fourth degree of maqam Saba, the hicaz note, is
performed at higher pitch than indicated on the staff
notation.

The mismatching intervals for the given maqams
support the criticism. Other degrees’ peaks match the
theoretical intervals with 1/3 Hc precision. This
example shows that the constructed histogram templates
can be successfully used for theoretical studies. Due to
space considerations, results from other maqams (which
lead to the same conclusions) are not discussed here.


5. Discussions and conclusions 

This paper presented a new method to align histograms
of different recordings of Turkish maqam music. Such a
method is very useful for automatic processing of
multiple files, which was not available until now in the
domain of Turkish maqam music research. There are
various applications for such automatic processing. To
name a few: music information retrieval applications like
automatic maqam classification and automatic transcription,
and tuning research. For example, in studies aiming
at detection of the whole set of pitches for maqam music,
such a tool is potentially very useful. In Turkish maqam
music theory, how many tones in an octave are needed to
represent the maqam scale theory properly is still an
open question. In the literature various propositions can
be found, varying from 17 to 79 tones in an octave
(Yarman, 2007, 2008). With the method presented, large
databases including maqam recordings with the same 
tonic can be aligned and viewed together with intervals 
specified in various theories. It is one of our future goals 
to study results from recordings of Tanburi Cemil Bey 
and compare in detail the histogram templates obtained 
with the intervals specified in several other theories like 
the Tore-Karadeniz system (Karadeniz, 1965) in com¬ 
parison to the AEU system. Since average histograms 
carry information about commonly played pitches rather 
than details, they can be successfully used for automatic 
classification problems. In Gedik and Bozkurt (2008) it 
has been shown that high classification rates are obtained 
by simply using average histograms in a template 
matching framework for automatic classification. It can 
be misleading to process fine details of the average 
histograms and draw general conclusions or expect that 
such histograms reveal details about the nature of a given 
maqam. 
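The template matching idea mentioned above can be illustrated with a minimal Python sketch. This is not the implementation of Gedik and Bozkurt (2008): the city-block distance, the function names and the toy six-bin histograms are all assumptions made for illustration, and the histograms are assumed to be already aligned on the tonic.

```python
def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]

def city_block(a, b):
    """Sum of absolute bin-wise differences (an assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def classify(hist, templates):
    """Return the name of the template closest to the given histogram."""
    query = normalize(hist)
    return min(
        templates,
        key=lambda name: city_block(query, normalize(templates[name])),
    )

# Toy tonic-aligned histograms, invented for illustration only.
templates = {
    "maqam_A": [5, 0, 3, 0, 2, 0],
    "maqam_B": [5, 0, 0, 3, 0, 2],
}

print(classify([9, 1, 5, 0, 4, 1], templates))
```

In practice the bins would have a fraction-of-a-comma resolution and the templates would be the average histograms computed from aligned recordings.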


Our proposed method involves a novel automatic
tonic detection algorithm. The algorithm
was tested on 67 recordings of Tanburi Cemil Bey, and
the tonics of 62 files (about 93%) were correctly identified.
At this stage, the tests can be considered a demonstration
of the method's potential.
The algorithm still needs to be evaluated on
large databases that include recordings made under various
acoustic conditions, by various musicians and in various
forms. We leave such extensive testing to future work.
f0 estimation errors clearly contribute to the overall error, but
the methodology we propose is independent of any specific
f0 estimation algorithm. Therefore the contribution of
f0 estimation errors to the average histogram computation
is not studied in detail either. Efficient post-processing
techniques for the reduction of f0 estimation
errors are also proposed in the first sections of this
paper.

In addition, certain deficiencies of the study
should be mentioned. Our approach discards the characteristics
that vary with time and melodic context, which are essential
for maqam music. Despite these deficiencies, we
think that the tools provided will be of high value for at
least one of the most important dimensions of maqam
music: the pitch scale theory.

Finally, most of the techniques proposed in this study 
are not limited to Turkish music. We think they are 
potentially applicable to other traditional (modal) music 
recordings from Pakistan, Afghanistan, Iran, Egypt, 
Morocco, Tunisia, Algeria, Azerbaijan, etc. 
Appendix A: Pitch scale intervals in the Arel-Ezgi-Uzdilek (AEU) system

The most commonly referenced system for Turkish maqam
music is the Arel-Ezgi-Uzdilek (AEU) system (Arel,
1930; Ezgi, 1933), which is considered the "official
theory". Most of the available theory and instrument
method books use AEU as the basic system. It is shown
in Yarman (2007) that the AEU system is also equivalent
to Yekta's (1922) system in terms of the intervals used. In
the AEU system, the following basic intervals (in terms
of Holdrian commas (Hc)) are used:


Bakiye (B): 4 Hc

Küçük Mücennep (S): 5 Hc

Büyük Mücennep (K): 8 Hc

Tanini (T): 9 Hc

Artık ikili (A): 12 or 13 Hc


The accidentals used in the AEU system are summarized
in Figure 10. Each step corresponds to 1 Hc.

For each maqam, a scale is presented in Arel (1930)
and the intervals are provided in terms of the basic
intervals listed above. From the staff notation of the
scale, the intervals can be deduced and vice versa. For
example, the scale for maqam Hüseyni is specified with
the basic intervals KSTTKST, which corresponds to the
following intervals in Hc with respect to the tonic: [8 13
22 31 39 44 53]. Figure 11 presents the scale for maqam
Hüseyni.
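The conversion from a sequence of basic intervals to intervals with respect to the tonic is a running sum. A minimal Python sketch, using the interval sizes in Holdrian commas (Hc) listed above, is given below. The function name `scale_from_letters` is hypothetical, and the default of 12 commas for the augmented second (A) is an arbitrary choice for illustration, since the text allows 12 or 13.

```python
from itertools import accumulate

# Basic AEU interval sizes in Holdrian commas, as listed above.
# "A" may be 12 or 13 commas, so it is supplied by the caller.
BASIC_INTERVALS = {"B": 4, "S": 5, "K": 8, "T": 9}

def scale_from_letters(letters, augmented=12):
    """Return cumulative intervals (in Holdrian commas) above the tonic."""
    sizes = dict(BASIC_INTERVALS, A=augmented)
    return list(accumulate(sizes[ch] for ch in letters))

print(scale_from_letters("KSTTKST"))  # [8, 13, 22, 31, 39, 44, 53]
```

The last value is always 53 for a complete octave, which gives a quick sanity check on any interval sequence typed in by hand.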

In this notation system, whole tones are represented as
T (Tanini - 9 Hc) and half tones are represented as B
(Bakiye - 4 Hc). Starting from the tonic dügah (A), the
first interval is a whole tone lowered by a bemol of size
1 Hc. The first interval is therefore 9 - 1 = 8
Hc, corresponding to K (Büyük Mücennep). The second
interval is a half tone plus 1 Hc, again due to the 1 Hc
bemol on the first note, resulting in an interval size of
4 + 1 = 5 Hc, S (Küçük Mücennep).

It is often useful to display theoretical intervals
together with the estimated f0 values to gain insight
into frequency variations. In Figure 12 we present such
an example.
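To overlay theoretical intervals on an f0 track, both need to share an axis. One option is to express each f0 estimate as a distance from the tonic in Holdrian commas, using the fact that one Holdrian comma is 1/53 of an octave. The Python sketch below is illustrative only: the function names and the tonic frequency of 220 Hz are assumptions, not values taken from the recordings.

```python
import math

COMMAS_PER_OCTAVE = 53  # one Holdrian comma is 1/53 of an octave

def hz_to_commas(f0_hz, tonic_hz):
    """Distance of an f0 value from the tonic, in Holdrian commas."""
    return COMMAS_PER_OCTAVE * math.log2(f0_hz / tonic_hz)

def nearest_theoretical(interval_hc, theory_hc):
    """Closest theoretical interval and the signed deviation from it."""
    closest = min(theory_hc, key=lambda t: abs(interval_hc - t))
    return closest, interval_hc - closest

# AEU intervals for maqam Huseyni, taken from Table 1.
HUSEYNI = [0, 8, 13, 22, 31, 39, 44, 53]

tonic = 220.0  # hypothetical tonic frequency in Hz
for f0 in (220.0, 242.0, 330.0, 440.0):
    hc = hz_to_commas(f0, tonic)
    closest, dev = nearest_theoretical(hc, HUSEYNI)
    print(f"{f0:6.1f} Hz -> {hc:5.2f} Hc, nearest {closest} Hc, deviation {dev:+.2f}")
```

The signed deviation is the quantity of interest when looking for theory-application mismatches on individual notes.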


Fig. 10. Accidentals used in the AEU system.

Fig. 11. The scale for maqam Hüseyni according to the AEU system.

Fig. 12. An example pitch track for maqam Hüseyni (Vol. 4, Track 4, Hüseyni Taksim, end part).


Table 1. Some maqam scale intervals (with dügah as tonic) in AEU.

Maqam              Intervals in Hc with respect to the tonic

Hicaz              0    5   17   22   31   35   44   53
Rast               0    9   17   22   31   40   48   53
Segah              0    5   14   22   31   36   49   53
Kürdili Hicazkar   0    4   13   22   31   35   44   53
Hüzzam             0    5   14   19   31   36   49   53
Nihavend           0    9   13   22   31   35   44   53
Hüseyni            0    8   13   22   31   39   44   53
Uşşak              0    8   13   22   31   35   44   53
Saba               0    8   13   18   31   35   44   53


For maqam Hüseyni, there is a theory-application
mismatch for the first interval (marked on the figure). A
similar characteristic is observed throughout a recording.
For this reason, matches and mismatches between
theory and application can be observed on pitch
histograms such as Figure 6(a) (the centre of the second
peak differs slightly between the theoretical template and the
actual histogram). It is therefore very convenient to
study theory-application mismatches using pitch histograms.
In fact, many studies base their claims on
observations of pitch histograms (Akkoç, 2002; Zeren,
2003; Karaosmanoğlu & Akkoç, 2003; Karaosmanoğlu,
2004).

From these representations, one can deduce a list of 
intervals with respect to the tonic as presented in Table 1 
for some commonly used maqams. 

For automatic processing, using such tables as 
descriptors of maqam scales is very convenient. We 
utilize such tables in text format as complementary files 
in our implementations. 
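As an illustration of how such a table might be stored and read, the Python sketch below parses a plain-text layout with one maqam per line. The file layout (`Name: v1 v2 ...`, with `#` for comments) and the function names are assumptions, since the format used in our implementation is not specified here. The conversion to cents relies only on the definition of the Holdrian comma as 1/53 of an octave (1200 cents).

```python
def parse_scale_table(text):
    """Parse lines of the form 'Name: 0 5 17 ...' into {name: [ints]}."""
    scales = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, values = line.partition(":")
        scales[name.strip()] = [int(v) for v in values.split()]
    return scales

def commas_to_cents(interval_hc):
    """Convert an interval in Holdrian commas to cents."""
    return interval_hc * 1200.0 / 53.0

# Hypothetical file contents; the values are copied from Table 1.
EXAMPLE = """
# maqam: intervals in Holdrian commas above the tonic
Hicaz: 0 5 17 22 31 35 44 53
Rast: 0 9 17 22 31 40 48 53
Saba: 0 8 13 18 31 35 44 53
"""

scales = parse_scale_table(EXAMPLE)
print(scales["Saba"])
print([round(commas_to_cents(hc), 1) for hc in scales["Rast"]])
```

Keeping the theoretical scales in a separate text file makes it straightforward to swap in the intervals of another theory without touching the analysis code.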


Appendix B: Audio material 

For tests and plots, we have used 67 recordings as audio 
material from “Tanburi Cemil Bey: 1-5” Crossroads CD 
4264, remastered from the Orfeon 10498 original 78 rpm 
record (1910-1914). From a total of 73 tracks, six tracks
were excluded for the following reasons:

- rotational speed variation in the recording mechanism,
causing pitch shifts within a recording;

- piano being used as the accompanying instrument;

- f0 estimation problems due to excessive background
noise.

The remaining material used for analysis contains 
performances in the following maqams: 

Maqams with dügah as tonic (27 recordings):

Uşşak (6), Hüseyni (4), Saba (3), Gülizar (2), Tahir
Buselik (2), Isfahan (2), Neva (1), Tahir (1), Bayati (1),
Kürdi (1), Şehnaz (1), Hicaz (1), Nişaburek (1),
Muhayyer (1).

Maqams with rast as tonic (16 recordings):

Rast (2), Pesendide (1), Neveser (1), Nihavend (2), Mahur
(3), Kürdili Hicazkar (2), Hicazkar (3), Suzinak (2).

Maqams with segah as tonic (5 recordings):

Segah (3), Müstear (1), Hüzzam (1).

Maqams with ırak as tonic (8 recordings):

Irak (1), Bestenigar (3), Eviç (2), Ferahnak (2).

Maqams with yegah as tonic (8 recordings):

Yegah (3), Ferahfeza (3), Şedaraban (2).

Maqams with hüseyni aşiran as tonic (2 recordings):

Nühüft (1), Suzidil (1).

Maqams with acem aşiran as tonic (1 recording):

Şevkefza (1).

On these recordings Tanburi Cemil Bey plays
kemençe (an unfretted instrument), tanbur (an instrument
with movable frets), yaylı tanbur (tanbur played with a
bow), lute and violin. It is acknowledged in various
texts that Tanburi Cemil Bey used fewer
frets (43 for a two-octave range, whereas today's
tanburs have more than 50 frets) and adjusted fret
locations from time to time before starting a taksim in a
specific maqam.

A few pre-processing operations were performed on the
recordings. Most of the recordings contain speech at the
beginning, announcing the content of the recording.
These portions of the recordings were silenced. When
necessary and possible, periodic noise was suppressed via
frequency selective filters.


